Genetic Algorithm for Document Clustering with Simultaneous and Ranked Mutation

Author

  • K. Premalatha
Abstract

Clustering divides data into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects in other groups. A clustering algorithm attempts to find natural groupings of components based on some similarity measure. Traditional clustering algorithms search only a small subset of all possible clusterings, so there is no guarantee that the solution found is optimal. This paper presents document clustering based on a genetic algorithm with a simultaneous mutation operator and a ranked mutation rate. The mutation operation is significant to the success of genetic algorithms because it expands the search directions and helps avoid convergence to local optima. Different stages of the genetic process for a given problem may call for different mutation operators to obtain the best results. In simultaneous mutation, the genetic algorithm uses several mutation operators concurrently to produce the next generation, and the mutation ratio of each operator changes according to an assessment of the offspring it produces. In the ranked scheme, the mutation rate of each chromosome is adapted based on its fitness rank in the previous population. Experimental results on a document corpus show that the proposed algorithm statistically outperforms the simple GA and K-Means.
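The abstract gives no formulas for the two mechanisms, so the following is only a minimal sketch of how they might fit together. Everything specific here is an assumption made for illustration: the chromosome encoding (one cluster label per document), the two operators `reassign_mutation` and `swap_mutation`, the linear rank-to-rate mapping with bounds `p_min` and `p_max`, the choice to give better-ranked chromosomes lower rates, and the `floor` that keeps every operator in play. None of these names or values come from the paper.

```python
import random


def rank_based_rates(fitnesses, p_min=0.01, p_max=0.2):
    """Map each chromosome's fitness rank to a mutation rate.

    Assumption: the fittest chromosome gets p_min, the least fit gets
    p_max, with linear interpolation over ranks in between.
    """
    n = len(fitnesses)
    # Indices ordered from best (highest fitness) to worst.
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    rates = [0.0] * n
    for rank, idx in enumerate(order):
        frac = rank / (n - 1) if n > 1 else 0.0
        rates[idx] = p_min + (p_max - p_min) * frac
    return rates


def update_operator_ratios(gains, floor=0.05):
    """Recompute operator ratios from the fitness gain of their offspring.

    gains: dict mapping operator name to average offspring improvement.
    Assumption: ratios are proportional to non-negative gain, with a
    floor so that no operator is dropped entirely.
    """
    names = list(gains)
    total = sum(max(g, 0.0) for g in gains.values())
    if total == 0:
        return {name: 1.0 / len(names) for name in names}
    raw = {name: max(floor, max(gains[name], 0.0) / total) for name in names}
    norm = sum(raw.values())
    return {name: value / norm for name, value in raw.items()}


def reassign_mutation(chrom, k, rng):
    """Hypothetical operator: move one document to a random cluster."""
    child = chrom[:]
    child[rng.randrange(len(child))] = rng.randrange(k)
    return child


def swap_mutation(chrom, k, rng):
    """Hypothetical operator: swap the cluster labels of two documents."""
    child = chrom[:]
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def mutate_population(pop, fitnesses, ratios, operators, k, rng):
    """Apply several operators concurrently within one generation.

    Each chromosome mutates with its rank-based rate; when it does, the
    operator is drawn according to the current operator ratios.
    Returns a list of (child, operator_name_or_None) pairs.
    """
    rates = rank_based_rates(fitnesses)
    names = list(operators)
    weights = [ratios[name] for name in names]
    result = []
    for chrom, rate in zip(pop, rates):
        if rng.random() < rate:
            name = rng.choices(names, weights=weights)[0]
            result.append((operators[name](chrom, k, rng), name))
        else:
            result.append((chrom[:], None))
    return result


operators = {"reassign": reassign_mutation, "swap": swap_mutation}
ratios = update_operator_ratios({"reassign": 3.0, "swap": 1.0})
rng = random.Random(0)
population = [[0, 1, 2, 0, 1], [2, 2, 1, 0, 0], [1, 0, 0, 2, 1]]
offspring = mutate_population(population, [0.9, 0.1, 0.5], ratios, operators, 3, rng)
```

In a full loop, the operator name returned with each child would be used to accumulate the fitness gain per operator, which then feeds `update_operator_ratios` before the next generation.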

Similar resources

A Hybrid Clustering Method Based on a Genetic Algorithm Using New Variation Operators

The clustering problem under the minimum sum-of-squares criterion is a non-convex, non-linear program with many locally optimal values, so its solution often gets stuck at a local optimum and cannot converge to the global optimum. In this paper, we introduce several new variation operators for the proposed hybrid genetic algorithm for the cl...

Airfoil Shape Optimization with Adaptive Mutation Genetic Algorithm

An efficient method for scattering Genetic Algorithm (GA) individuals in the design space is proposed to accelerate airfoil shape optimization. The method used here is based on the variation of the mutation rate for each gene of the chromosomes by taking feedback from the current population. An adaptive method for airfoil shape parameterization is also applied and its impact on the optimum desi...

Effective Fuzzy Ontology Based Distributed Document Using Non-Dominated Ranked Genetic Algorithm

The increase in the number of documents has aggravated the difficulty of classifying those documents according to specific needs. Clustering analysis in a distributed environment is a thrust area in artificial intelligence and data mining. Its fundamental task is to utilize characters to compute the degree of related corresponding relationship between objects and to accomplish automatic classif...

Automatic Data Clustering Using an Improved Imperialist Competitive Algorithm

Imperialist Competitive Algorithm (ICA) is considered as a prime meta-heuristic algorithm to find the general optimal solution in optimization problems. This paper presents a use of ICA for automatic clustering of huge unlabeled data sets. By using proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...

STRUCTURAL OPTIMIZATION USING A MUTATION-BASED GENETIC ALGORITHM

The present study proposes a mutation-based real-coded genetic algorithm (MBRCGA) for sizing and layout optimization of planar and spatial truss structures. The Gaussian mutation operator is used to create the reproduction operators. An adaptive tournament selection mechanism in combination with adaptive Gaussian mutation operators is proposed to achieve an effective search in ...

Publication date: 2009